Method and apparatus for encoding and decoding images

ABSTRACT

A method is disclosed for adaptive entropy coding and decoding of a stream of indices representative of information, including the following steps: establishing, at both an encoder and a decoder, respective coding tables to have codes corresponding to indices; finding, in the encoder table, codes corresponding to indices of the stream of indices, and sending the found codes to the decoder for recovery of the indices corresponding to the sent codes; determining an overall frequency of occurrence of individual indices in the stream of indices; determining a measure of how currently individual indices have occurred in the stream of indices; and adaptively modifying the encoder and decoder tables in accordance with the measure of how currently individual indices have occurred and the frequency of occurrence.

RELATED APPLICATION

This application claims priority from U.S. Provisional Patent Application Ser. No. 60/020,776, filed Jun. 28, 1996, and said Provisional Patent Application is incorporated herein by reference.

FIELD OF THE INVENTION

This invention relates to compression of image-representative signals and, more particularly, to a method and apparatus for encoding and decoding image-representative signals.

BACKGROUND OF THE INVENTION

Image-representative signals (e.g. video signals) can be digitized, encoded, and subsequently decoded in a manner which substantially reduces the number of bits necessary to represent a decoded reconstructed image without undue or noticeable degradation in the reconstructed image. Coding methods that use transforms, for example discrete cosine transform ("DCT") or wavelet transforms, are well known and in widespread use. Image and video compression standards such as JPEG, MPEG-1, MPEG-2, H.261 and H.262 typically employ DCT-based techniques.

Techniques employing vector transform (VT) coding (see, for example, U.S. Pat. No. 5,436,985) can provide substantial improvements in coding efficiency over DCT-based methods used in the above referenced standards. In VT coding, an image (e.g. a video frame, a segmented video frame, or a motion compensated difference frame) is sub-sampled into multiple small images. Each small image is converted into a different format by using a transform such as the discrete cosine transform or a wavelet transform. The corresponding transform coefficients from the small images are grouped together to form vectors. Vector quantization is used to quantize and code those vectors.

Although techniques such as VT coding are advantageous, service providers can be faced with the problem of needing to retain hardware for encoding DCT-based and wavelet-based signals to serve those users who only have decoder equipment for such signals, while also investing in equipment that can encode the signals of more advanced techniques such as VT coding in order to serve those users who have the more advanced decoder equipment. The problem is analogous from a user standpoint, where a user having only DCT-based and/or wavelet-based decoder hardware will be limited in capability of decoding signals encoded with more advanced techniques such as VT coding, whereas purchasers of VT decoding equipment will also want to be able to decode the signals encoded with DCT-based and wavelet-based encoders that will remain in use, but without having to purchase additional equipment for doing so.

It is among the objects of the present invention to provide improvements in encoding and decoding techniques and apparatus that are responsive to the problems just summarized. It is also among the objects of the invention to provide improved coding options and to provide an improved technique and apparatus for entropy coding.

SUMMARY OF THE INVENTION

In a form of the present invention, a method is set forth for decoding an encoded signal that includes an encoded control portion and an encoded video portion, comprising the following steps: providing a plurality of inverse transform functions; decoding the encoded signal to recover said control portion; selecting one of said inverse transform functions in accordance with the recovered control portion; and decoding said encoded video portion with the selected inverse transform function.

In an embodiment of this form of the invention, the inverse transform functions comprise inverse discrete cosine transform, inverse wavelet transform, and inverse vector transform. In this embodiment, the control portion specifies the level of wavelet decomposition of said wavelet transform and the subsampling factor of said vector transform.

Further features and advantages of the invention will become more readily apparent from the following detailed description when taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an apparatus which can be used to practice embodiments of the invention.

FIG. 2 is a flow diagram of a routine that can be utilized to program the encoder processor in accordance with an embodiment of the invention.

FIG. 3 is a flow diagram of a routine that can be utilized to program the decoder processor in accordance with an embodiment of the invention.

FIG. 4 is a flow diagram of an embodiment of a routine for adaptive entropy coding.

FIG. 5 is a flow diagram of an embodiment of a routine for adaptive entropy decoding.

DETAILED DESCRIPTION

Referring to FIG. 1, there is shown a block diagram of an apparatus which can be used in practicing embodiments of the invention for encoding and decoding images 100. A video camera 102, or other source of video signal, produces an array of pixel-representative signals that are coupled to an analog-to-digital converter 103, which is, in turn, coupled to the processor 110 of an encoder 105. When programmed in the manner to be described, the processor 110 and its associated circuits can be used to implement embodiments of the invention. The processor 110 may be any suitable processor, for example an electronic digital processor or microprocessor. It will be understood that any general purpose or special purpose processor, or other machine or circuitry that can perform the functions described herein, electronically, optically, or by other means, can be utilized. The processor 110, which for purposes of the particular described embodiments hereof can be considered as the processor or CPU of a general purpose electronic digital computer, such as a Model Ultra-1 sold by Sun Microsystems, Inc., will typically include memories 123, clock and timing circuitry 121, input/output functions 118 and monitor 125, which may all be of conventional types. In the present embodiment blocks 131, 133 and 135 represent functions that can be implemented in hardware, software, or a combination thereof. The block 131 represents a discrete cosine transform function that can be implemented using commercially available DCT chips or combinations of such chips with known software, and the block 133 represents a wavelet transform that can be implemented using commercially available wavelet transform chips, or combinations of such chips with known software. The block 135 represents a vector transform function that can be implemented in accordance with the routines set forth in U.S. Pat. No. 5,436,985 (incorporated herein by reference) or hardware equivalents. As described in said '985 Patent, vector quantization (represented by block 136) is employed as part of the VT coding. The vector quantization can be lattice VQ, for example of the type described in copending U.S. patent application Ser. No. 08/733,849, filed Oct. 18, 1996, and copending U.S. patent application Ser. No. 08/743,631, filed Nov. 4, 1996, both assigned to the same assignee as the present application, and both incorporated herein by reference. A transformed VQ (represented by block 137) is described hereinbelow.

With the processor appropriately programmed, as described hereinbelow, an encoded output signal 101 is produced which is a compressed version of the input signal 90 and requires less bandwidth and/or less memory for storage. In the illustration of FIG. 1, the encoded signal 101 is shown as being coupled to a transmitter 135 for transmission over a communications medium (e.g. air, cable, fiber optical link, microwave link, etc.) 50 to a receiver 162. The encoded signal is also illustrated as being coupled to a storage medium 138, which may alternatively be associated with or part of the processor subsystem 110, and which has an output that can be decoded using the decoder to be described.

Coupled with the receiver 162 is a decoder 155 that includes a similar processor 160 (which will preferably be a microprocessor in decoder equipment) and associated peripherals and circuits of similar type to those described in the encoder. These include input/output circuitry 164, memories 168, clock and timing circuitry 173, and a monitor 176 that can display decoded video 100'. Also provided are blocks 181, 183 and 185 that represent functions which (like their counterparts 131, 133 and 135 in the encoder) can be implemented in hardware, software, or a combination thereof. The block 181 represents an inverse discrete cosine transform function that can be implemented using commercially available IDCT chips or combinations of such chips with known software, and the block 183 represents an inverse wavelet transform function that can be implemented using commercially available inverse wavelet transform chips, or combinations of such chips with known software. The block 185 represents an inverse vector transform function that can be implemented in accordance with the routines set forth in the above-referenced U.S. Pat. No. 5,436,985 or hardware equivalents. As described in said '985 Patent, inverse vector quantization (represented by block 186) is employed as part of the inverse VT coding. The inverse vector quantization can be inverse lattice VQ, for example of the type described in the above-referenced copending U.S. patent application Ser. Nos. 08/733,849, now U.S. Pat. No. 5,883,981, and 08/743,631, now U.S. Pat. No. 5,940,542. An inverse transformed VQ (represented by the block 187) is described hereinbelow.

In order to provide a more universal approach to encoding/decoding wherein, for example in the present embodiment, VT coding is made compatible with the DCT-based and wavelet-based compression techniques, three parameters are introduced and are described as follows:

Level of decomposition (LD):

This parameter takes an integer value from 0 to MAXLD and indicates the level of wavelet decomposition. When LD=0, it indicates that the DCT is used instead of a wavelet transform. For example, if the maximum level of decomposition is chosen to be MAXLD=7, three bits are needed for coding this parameter as follows:

    ______________________________________
    LD value  LD code  meaning
    ______________________________________
    0         000      use the DCT
    1         001      use 1 level of wavelet decomposition
    2         010      use 2 levels of wavelet decomposition
    3         011      use 3 levels of wavelet decomposition
    4         100      use 4 levels of wavelet decomposition
    5         101      use 5 levels of wavelet decomposition
    6         110      use 6 levels of wavelet decomposition
    7         111      use 7 levels of wavelet decomposition
    ______________________________________

Factor of subsampling (FS):

This parameter takes an integer value from 0 to MAXFS. 2^(FS) indicates the factor of subsampling used for vector transform. When FS=0, 2^(FS)=1, which indicates that no subsampling is performed. For example, if the maximum FS value is chosen to be MAXFS=7, three bits are needed for coding this parameter as follows:

    ______________________________________
    FS value  FS code  meaning
    ______________________________________
    0         000      no subsampling
    1         001      subsampling by a factor of 2
    2         010      subsampling by a factor of 4
    3         011      subsampling by a factor of 8
    4         100      subsampling by a factor of 16
    5         101      subsampling by a factor of 32
    6         110      subsampling by a factor of 64
    7         111      subsampling by a factor of 128
    ______________________________________
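By way of illustration only, the subsampling selected by FS and its inverse (interleaving) can be sketched as follows. The sketch assumes that the factor 2^(FS) applies in each of the two image directions, consistent with FS=2 corresponding to 4×4 subsampling in the examples hereof, and that the image dimensions are multiples of 2^(FS). The function names `subsample` and `interleave` are hypothetical and are not taken from the '985 Patent.

```python
def subsample(image, fs):
    """Split a 2-D list into (2**fs) x (2**fs) small images by taking every
    (2**fs)-th sample in each direction, one phase per small image."""
    step = 2 ** fs
    return [[[row[dx::step] for row in image[dy::step]]
             for dx in range(step)]
            for dy in range(step)]


def interleave(subs, fs):
    """Inverse of subsample(): put every small image back on its phase."""
    step = 2 ** fs
    rows = len(subs[0][0]) * step
    cols = len(subs[0][0][0]) * step
    image = [[None] * cols for _ in range(rows)]
    for dy in range(step):
        for dx in range(step):
            for r, row in enumerate(subs[dy][dx]):
                for c, value in enumerate(row):
                    image[dy + r * step][dx + c * step] = value
    return image


image = [[4 * r + c for c in range(4)] for r in range(4)]  # 4x4 test image
subs = subsample(image, 1)            # FS=1: factor 2, four 2x2 small images
assert interleave(subs, 1) == image   # the round trip restores the image
```

With FS=0 the single "small image" is the image itself, which matches the "no subsampling" entry of the table.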

Method of quantization (MQ):

In the present example, this parameter takes a value of either 0 or 1 as shown in the following table:

    ______________________________________
    MQ value  meaning
    ______________________________________
    0         use lattice VQ
    1         use transformed lattice VQ
    ______________________________________

As an example, a description of an 8×8 transformed Z lattice vector quantization (VQ) technique can be summarized as follows:

each 8×8 vector is transformed into a different coordinate system so that the distribution boundary becomes rectangular. For example, an 8×8 DCT transform can be used;

the transformed vector is quantized using a Z₆₄ lattice;

the coordinate values of the closest Z₆₄ lattice point are ordered into a 1-D sequence according to a zig-zag scan;

the 1-D sequence is runlength and entropy coded;

the coded bitstream becomes the index of the 8×8 vector.
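The steps just listed can be sketched, by way of example only, as follows. The sketch assumes an orthonormal 8×8 DCT, a hypothetical scale parameter `step` applied before rounding, the conventional zig-zag order over anti-diagonals, and run-length pairs of the form (number of preceding zeros, value). The final entropy coding step is not shown, and none of the function names is taken from the present description.

```python
import math

N = 8  # block size of the 8x8 example


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (list of lists)."""
    def scale(k):
        return math.sqrt((1 if k == 0 else 2) / N)

    def basis(n, k):
        return math.cos((2 * n + 1) * k * math.pi / (2 * N))

    return [[scale(u) * scale(v) * sum(block[x][y] * basis(x, u) * basis(y, v)
                                       for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]


# Zig-zag order: walk the anti-diagonals, alternating direction.
ZIGZAG = sorted(((r, c) for r in range(N) for c in range(N)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


def run_length(sequence):
    """(zero_run, value) pairs for the non-zero values; trailing zeros dropped."""
    pairs, run = [], 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs


def transformed_z_lattice_vq(block, step=1.0):
    coeffs = dct2(block)                                  # change of coordinates
    # The nearest point of the integer lattice is found by rounding each coordinate.
    lattice = [[round(c / step) for c in row] for row in coeffs]
    scanned = [lattice[r][c] for r, c in ZIGZAG]          # 2-D to 1-D sequence
    return run_length(scanned)                            # entropy coding not shown


flat = [[16] * N for _ in range(N)]
print(transformed_z_lattice_vq(flat))
```

For the flat block of the usage line, only the DC coefficient is non-zero, so the sequence reduces to the single pair (0, 128).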

A combination of the three above-described parameters indicates a particular coding method. For example, the following coding methods can be covered:

DCT-based coding as used in the current standards:

Set LD=0, FS=0, and MQ=0. The DCT is used instead of wavelet because LD=0. No subsampling is performed because FS=0. Lattice VQ becomes uniform scalar quantization when vector dimension becomes 1. Therefore, MQ=0 means uniform scalar quantization when FS=0 (2^(FS)=1).

Wavelet-based coding:

The only difference between this case and the previous one is that LD is set to a non-zero integer. For example, a 3-level wavelet decomposition plus uniform scalar quantization would have LD=3, FS=0, and MQ=0.

Vector wavelet coding using Λ₁₆ lattice VQ:

In this case, LD is still a non-zero integer and FS also becomes a non-zero integer. Because Λ₁₆ lattice VQ is used, FS should be set to 2 so that subsampling of 4×4 is performed. For example, a 3-level vector wavelet decomposition plus Λ₁₆ lattice VQ would have LD=3, FS=2, and MQ=0.

Vector wavelet coding using transformed Z lattice VQ:

This case is the same as the previous one except lattice VQ is replaced with transformed lattice VQ. For example, a 3-level vector wavelet decomposition plus 8×8 transformed lattice VQ would have LD=3, FS=3, and MQ=1.

Vector DCT coding using transformed Z lattice VQ:

This case is the same as the previous one except wavelet is replaced with DCT. For example, if an 8×8 transformed lattice VQ is still used, the parameters should be LD=0, FS=3, and MQ=1.

FIG. 2 is a flow diagram of a routine that can be utilized to program the encoder processor in accordance with an embodiment of the invention. The block 203 represents the inputting of operator-selected control parameters, that is, the selected values of the parameters LD, FS, and MQ, as described above. A digital control word or signal, in this case seven bits (3 bits for LD, 3 bits for FS, and 1 bit for MQ), is generated as representing the control parameters (block 205). The control bits can then be output (block 207), such as to an output register, for inclusion, for example, in the header of the bit stream.
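As a minimal sketch of the seven-bit control word, assuming MAXLD=7 and MAXFS=7 as in the tables above, the three parameters can be packed and recovered as shown. The bit order (LD in the most significant bits, then FS, then MQ) and the function names are assumptions made for illustration; the description does not fix them.

```python
def pack_control_word(ld, fs, mq):
    """Pack LD (3 bits), FS (3 bits) and MQ (1 bit) into a 7-bit integer."""
    if not (0 <= ld <= 7 and 0 <= fs <= 7 and mq in (0, 1)):
        raise ValueError("LD and FS must be 0..7 and MQ must be 0 or 1")
    return (ld << 4) | (fs << 1) | mq


def unpack_control_word(word):
    """Recover (LD, FS, MQ) from a 7-bit control word."""
    return (word >> 4) & 0b111, (word >> 1) & 0b111, word & 0b1


word = pack_control_word(ld=3, fs=2, mq=0)  # vector wavelet coding, Λ16 lattice VQ
print(format(word, "07b"))                   # prints 0110100
assert unpack_control_word(word) == (3, 2, 0)
```

A decoder that recovers the control bits from the header would apply the unpacking step before selecting its inverse transform functions.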

Inquiry is then made (decision block 210) as to whether FS is 0. If so, in this example, vector transform (VT) coding is not being used, and the block 215 is entered directly. If not, the block 212 is entered and subsampling is implemented at the factor determined by FS. Inquiry is then made (decision block 215) as to whether LD is 0. If so, discrete cosine transform (DCT) is being used and the block 217 is entered for implementation of DCT. If not, the block 220 is entered, this block representing implementation of wavelet transform using a number of levels of wavelet decomposition determined by LD.

Inquiry is then made (decision block 225) as to whether FS is 0. If so, as previously noted, VT coding is not being used, quantization (block 227) and run length coding (block 228) are implemented and the block 260 is then entered. If not, the block 230 is entered, this block representing vector grouping in accordance with FS. Inquiry is then made (decision block 240) as to whether MQ is 0. If so, lattice VQ is implemented, as represented by the block 245. If not, transformed lattice VQ is implemented, which involves, in the context of vector transform, DCT of the grouped vectors (which have already been DCTed or wavelet transformed), followed by quantization (e.g. scalar quantization using Z-lattice) and run length coding, these functions being represented by the blocks 252, 255, and 257, respectively. Entropy coding can then be implemented (block 260), followed by outputting of the bit stream, as represented by the block 270. In the present embodiment, adaptive entropy coding is employed, as described in conjunction with the routine of FIG. 4.
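The branching of the encoder routine can be summarized, purely as an illustrative sketch, by a function that lists the processing steps selected by a given (LD, FS, MQ). The function name `encoder_stages` and the step labels are hypothetical; the block numbers in the comments are those of FIG. 2 as described above.

```python
def encoder_stages(ld, fs, mq):
    """List the FIG. 2 processing steps selected by (LD, FS, MQ)."""
    stages = []
    if fs != 0:                                          # decision block 210
        stages.append(f"subsample by {2 ** fs}")         # block 212
    if ld == 0:                                          # decision block 215
        stages.append("DCT")                             # block 217
    else:
        stages.append(f"wavelet transform, {ld} level(s)")  # block 220
    if fs == 0:                                          # decision block 225
        stages += ["quantization", "run length coding"]  # blocks 227, 228
    else:
        stages.append("vector grouping")                 # block 230
        if mq == 0:                                      # decision block 240
            stages.append("lattice VQ")                  # block 245
        else:
            stages += ["DCT of grouped vectors",         # block 252
                       "scalar quantization",            # block 255
                       "run length coding"]              # block 257
    stages.append("entropy coding")                      # block 260
    return stages


for step_name in encoder_stages(ld=3, fs=3, mq=1):
    print(step_name)
```

The usage line traces the vector wavelet case with 8×8 transformed lattice VQ (LD=3, FS=3, MQ=1).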

Referring to FIG. 3, there is shown a flow diagram of an embodiment of a routine that can be utilized to program the decoder processor in accordance with an embodiment of the invention. The block 305 represents recovering the control bits from the received data, and the block 310 represents entropy decoding on the received bit stream. In an embodiment hereof, the entropy decoding can be adaptive entropy decoding as described in conjunction with the routine illustrated in the flow diagram of FIG. 5. Inquiry is made (decision block 313) as to whether FS is 0. If so, vector transform (VT) was not implemented at the encoder, run length decoding (block 314) and inverse quantization (block 315) are implemented and the block 350 is then entered. If not, the decision block 316 is entered and inquiry is made as to whether MQ is 0. If so, inverse lattice VQ is implemented, as represented by the block 320. If not, an inverse of transformed lattice VQ is implemented as represented by the blocks 331, 334 and 336. In particular, these blocks are the inverse of the blocks 257, 255, and 252 of the encoder; namely, run length decoding (block 331), inverse scalar quantization (block 334) and inverse DCT (block 336). Vector separation of the vector groups, in accordance with FS, is then implemented, as represented by the block 340.

Inquiry is then made (decision block 350) as to whether LD is 0. If so, inverse DCT is implemented, as represented by the block 355. If not, inverse wavelet transform is implemented, at a level determined by LD, as represented by the block 360. Inquiry is then made (decision block 370) as to whether FS is 0. If so, VT has not been employed, and the block 385 is entered directly. If not, the block 380 is entered, this block representing the interleaving (the inverse of subsampling) at a factor determined by FS, whereupon the block 385 is entered. The block 385 represents outputting of the now recovered data, such as for video display and/or recording.
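The decoder routine selects its inverse operations from the same three recovered parameters. The following sketch, offered only as an illustration, lists the steps of FIG. 3 for a given (LD, FS, MQ); the function name `decoder_stages` and the step labels are hypothetical.

```python
def decoder_stages(ld, fs, mq):
    """List the FIG. 3 processing steps selected by (LD, FS, MQ)."""
    stages = ["entropy decoding"]                          # block 310
    if fs == 0:                                            # decision block 313
        stages += ["run length decoding",                  # block 314
                   "inverse quantization"]                 # block 315
    else:
        if mq == 0:                                        # decision block 316
            stages.append("inverse lattice VQ")            # block 320
        else:
            stages += ["run length decoding",              # block 331
                       "inverse scalar quantization",      # block 334
                       "inverse DCT of grouped vectors"]   # block 336
        stages.append("vector separation")                 # block 340
    if ld == 0:                                            # decision block 350
        stages.append("inverse DCT")                       # block 355
    else:
        stages.append(f"inverse wavelet transform, {ld} level(s)")  # block 360
    if fs != 0:                                            # decision block 370
        stages.append(f"interleave by {2 ** fs}")          # block 380
    return stages


for step_name in decoder_stages(ld=0, fs=3, mq=1):
    print(step_name)
```

The usage line traces the vector DCT case with 8×8 transformed lattice VQ (LD=0, FS=3, MQ=1).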

FIG. 4 is a flow diagram of a routine for controlling the encoder processor to implement the optional adaptive entropy encoding in accordance with an embodiment of the invention. The block 402 represents making any desired initial settings, for example setting of initial matched entry counts for data indices in an encoder table and also setting any initial non-access times (or cycle counts) for entries in the table. Next, for a received index of the data stream to be encoded, the entropy coding table is searched (block 405), and determination is made (block 407) as to whether a matched entry is found (i.e., whether there is a stored code for this index). If so, the code is output (block 410) for ultimate receipt by the decoder. Then, the count of matched entries for that code (or index) is increased by one (block 415) and the non-access time (or cycle count) of the matched entry is set to zero (block 418). Also, the non-access times (or cycle counts) of the other entries in the table are increased by one, as represented by the block 420. The next index of the input stream of indices is then awaited (block 475).

If a matched entry was not found, an escape code (which is a predetermined code that tells the decoder that the symbol will not be found in its table) is output (block 432), followed by outputting of the index itself (block 435). The index is then entered into the table with a matched entry count of one and a non-access time (or cycle count) of zero (block 438). The non-access times (or cycle counts) of the other entries in the table are then incremented by one (block 440). Inquiry is then made as to whether the table size is greater than a predetermined maximum size (decision block 450). If not, the block 475 is entered and the next index is awaited. If so, an entry is deleted from the table (block 460), namely, the entry with the largest non-access time (or cycle count). When there is more than one (that is, a tie), the one with the smallest matched entry count is selected for deletion. The block 475 is then entered.
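The table maintenance of FIG. 4 can be sketched as follows, under stated assumptions. The description does not say here how code words are assigned to table entries, so the sketch emits symbolic tokens instead: `("CODE", index)` stands in for the stored code of a matched entry, and `("ESC", None)` followed by `("RAW", index)` stands in for the escape code and the index itself. The class name, the token names, and the empty initial table are all hypothetical.

```python
ESCAPE = "ESC"  # hypothetical stand-in for the predetermined escape code


class AdaptiveEncoderTable:
    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = {}  # index -> [matched entry count, non-access time]

    def _age_others(self, current):
        for index, entry in self.entries.items():
            if index != current:
                entry[1] += 1                        # blocks 420 and 440

    def encode(self, index):
        """Return the tokens emitted for one input index and update the table."""
        entry = self.entries.get(index)
        if entry is not None:                        # block 407: matched entry found
            entry[0] += 1                            # block 415
            entry[1] = 0                             # block 418
            self._age_others(index)
            return [("CODE", index)]                 # block 410
        self.entries[index] = [1, 0]                 # block 438
        self._age_others(index)
        if len(self.entries) > self.max_size:        # decision block 450
            # Block 460: largest non-access time; a tie goes to the smallest count.
            victim = max(self.entries,
                         key=lambda i: (self.entries[i][1], -self.entries[i][0]))
            del self.entries[victim]
        return [(ESCAPE, None), ("RAW", index)]      # blocks 432 and 435


table = AdaptiveEncoderTable(max_size=2)
tokens = [t for index in [5, 7, 5, 9, 7] for t in table.encode(index)]
print(tokens)
```

In the usage lines, index 7 is deleted when index 9 arrives because 7 has gone longest without access, so the second occurrence of 7 is again sent with an escape.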

FIG. 5 shows a flow diagram of a routine for programming the decoder processor to implement the optional adaptive entropy decoding. The block 502 represents any necessary table initialization for correspondence with the encoder coding table. As will be described further, the procedure in the decoder maintains correspondence between the decoder coding table and the encoder coding table. A code word of the stream of index-representative code words is received, and determination is made (decision block 505) as to whether the code word is an escape code. If not, the corresponding index is fetched from the table and output (block 512). Next, operations are performed as represented by blocks 515, 518 and 520, which respectively correspond to their counterpart blocks 415, 418 and 420 of the encoder. Specifically, the count of the index entry is incremented (block 515), the non-access time (or cycle count) is set to zero (block 518), and the non-access times (or cycle counts) of the other entries are increased by one (block 520). The next code word is then awaited (block 525).

If it is determined that the received code word is an escape code, the index that follows it is received and output (block 542). The index is then entered in the table with a matched entry count of one and a non-access time (or cycle count) of zero (block 545). Then, the non-access times (or cycle counts) of the other entries of the table are increased by one (block 548). Determination is then made (decision block 550) as to whether the predetermined maximum table size has been exceeded. If not, the block 525 is entered and the next code word is awaited. If so, an entry is deleted from the table in accordance with the previously described deletion rules, as represented by the block 560. The block 525 is then entered.
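A corresponding sketch of the FIG. 5 decoder is given below, again only as an illustration. It assumes a hypothetical token stream in which `("ESC", None)` is followed by `("RAW", index)` and `("CODE", index)` stands in for a code word whose index is fetched from the table; the actual code word assignment is not specified in this description. The sketch deletes an entry only when the table size exceeds the maximum, which is the test stated for decision block 450 of FIG. 4, so that the two tables remain in correspondence.

```python
ESCAPE = "ESC"  # hypothetical stand-in for the predetermined escape code


def adaptive_decode(tokens, max_size):
    """Recover the index stream; return it together with the final table."""
    table = {}      # index -> [matched entry count, non-access time]
    indices = []
    stream = iter(tokens)
    for kind, value in stream:
        escaped = kind == ESCAPE                     # decision block 505
        if escaped:
            _, index = next(stream)                  # block 542: raw index follows
            table[index] = [1, 0]                    # block 545
        else:
            index = value                            # block 512 (lookup stand-in)
            if index not in table:
                raise ValueError("code word has no table entry")
            table[index][0] += 1                     # block 515
            table[index][1] = 0                      # block 518
        for other, entry in table.items():           # blocks 520 and 548
            if other != index:
                entry[1] += 1
        if escaped and len(table) > max_size:        # decision block 550
            # Block 560: largest non-access time; a tie goes to the smallest count.
            victim = max(table, key=lambda i: (table[i][1], -table[i][0]))
            del table[victim]
        indices.append(index)
    return indices, table


tokens = [("ESC", None), ("RAW", 5), ("ESC", None), ("RAW", 7), ("CODE", 5),
          ("ESC", None), ("RAW", 9), ("ESC", None), ("RAW", 7)]
indices, table = adaptive_decode(tokens, max_size=2)
print(indices)  # prints [5, 7, 5, 9, 7]
```

Because every update depends only on symbols the decoder has already received, no table contents need to be transmitted.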

What is claimed is:
1. A method for adaptive entropy coding and decoding of a stream of indices representative of information, comprising the steps of: establishing, at both an encoder and a decoder, respective coding tables to have codes corresponding to indices; finding, in said encoder table, codes corresponding to indices of said stream of indices, and sending the found codes to said decoder for recovery of the indices corresponding to the sent codes; determining an overall frequency of occurrence of individual indices in said stream of indices; determining a measure of how currently individual indices have occurred in said stream of indices; adaptively modifying said encoder and decoder tables in accordance with said measure of how currently individual indices have occurred and the frequency of occurrence.
2. The method as defined by claim 1, wherein said step of determining a measure of how currently individual indices have occurred in the stream of indices includes storing the times since last occurrence of each index.
3. The method as defined by claim 2, wherein the stored time since last occurrence of a given index is reset to zero when said index occurs in the stream of indices.